A generative model for feedback networks 
We investigate a simple generative model for network formation. The model is designed to describe
the growth of networks of kinship, trading, corporate alliances, or autocatalytic chemical reactions, 
where feedback is an essential element of network growth. The underlying graphs in these situations 
grow via a competition between cycle formation and node addition. After choosing a given node, 
a search is made for another node at a suitable distance. If such a node is found, a link is added 
connecting this to the original node, and increasing the number of cycles in the graph; if such a 
node cannot be found, a new node is added, which is linked to the original node. We simulate 
this algorithm and find that we cannot reject the hypothesis that the empirical degree distribution 
is a q-exponential function, which has been used to model long-range processes in nonequilibrium
statistical mechanics. 



I. INTRODUCTION 

We present a generative model for constructing net- 
works that grow via competition between cycle formation 
and the addition of new nodes. The algorithm is intended 
to model situations such as trading networks, kinship re- 
lationships, or business alliances, where networks evolve 
by either establishing closer connections by adding links 
to existing nodes or alternatively by adding new nodes. 
In arranging a marriage, for example, parents may at- 
tempt to find a partner within their pre-existing kinship 
network. For reasons such as alliance building and incest 
avoidance, such a partner should ideally be separated by 
a given distance in the kinship network [?]. Such a marriage
establishes a direct tie between families, creating
new cycles in the kinship network. Alternatively, if they 
do not find an appropriate partner within the existing 
network, they may seek a partner completely outside it, 
thereby adding a new node and expanding it. 

Another motivating example is trading networks [?].
Suppose two agents (nodes) are linked if they trade di- 
rectly. To avoid the markups of middlemen, and for 
reasons of trust or reliability, an agent may seek new, 
more distant, trading partners. If such a partner is found 
within the existing network, a direct link is established,
creating a cycle. If not, a new partner is found outside
the network, a direct link is established, and the network 
grows. A similar story can be told about strategic alliances
of businesses [?]; when a business seeks a partner,
that partner should not be too similar to businesses
with which relationships already exist. Thus the business 
will first take the path of least effort, and seek an appro- 
priate partner within the existing network of businesses 
that it knows; if this is not possible, it may be forced to 
find a partner outside the existing network. 

All of these examples share the common property that they involve a competition
between a process for creating new cycles within the existing network and the
addition of new nodes to the network. While there has been an explosion of work on
generative models of graphs [?], there has been very little work on networks of this
type. The only exception that we are aware of involves network models of
autocatalytic metabolisms [?]. Such autocatalytic networks have the property that
network growth comes about through the addition of autocatalytic cycles, which can
either involve existing chemical species or entirely new chemical species. Previous
work has focused on topological graph closure properties [10], [?] or the simulation
of chemical kinetics [13], and was not focused on the statistical properties of the
graphs themselves. We call graphs of the type that we study here feedback networks
because the cycles in the graph represent a potential for feedback processes, such
as strengthening the ties of an alliance or chemical feedback that may enhance the
concentration corresponding to an existing node [?].

We study the degree distributions of the graphs generated by our algorithm [?], and
find that they are well described by distribution functions that have recently been
proposed in nonequilibrium statistical mechanics, more precisely in nonextensive
statistical mechanics. Such distributions occur in the presence of strong
correlations, e.g. phenomena with long-range interactions. Our intuition for why
these distributions occur here is that the cycle generation inherently generates
long-range correlations in the construction of the graph.



II. MODEL 

The growth model we propose closely mimics the ex- 
amples given above. For each time step, a starting node 
i is randomly selected (e.g. the person or family looking 
for a marriage partner) and a target node j (the mar- 
riage partner) is searched for within the existing network. 
Node j is not known at the outset but is searched for 
starting at node i. The search proceeds by attempting to 
move through the existing network for some d number of 
steps without retracing the path. If the search is success- 
ful a new link (edge) is drawn from i to j. If the search is 
unsuccessful, as explained below, a new node j' is added 
to the graph and a link is drawn from i to j' . This pro- 
cess can be repeated for an arbitrary number of steps. In 
our simulations, we begin with a single isolated node but 
the initial condition is asymptotically not important. 

For each time step we randomly draw from a scale free 
distribution the starting node i, the distance d (number 
of steps necessary to locate j starting at i assuming that 
such a location does occur) , and for each node along the 
search path, the subsequent neighbor from which to con- 
tinue the search. While node j isn't randomly selected 
at the outset, it is obviously guaranteed that the shortest 
path distance from i to j is at most d. We now describe 
the model in more detail including the method for gen- 
erating search paths, and the criterion for a successful 
search. 

• Selection of node i. The probability $P_\alpha(i)$ of selecting a given node $i$
from among the $N$ nodes of the existing network is proportional to its degree
raised to a power $\alpha$:

$$P_\alpha(i) = \frac{[\deg(i)]^{\alpha}}{\sum_{m=1}^{N} [\deg(m)]^{\alpha}} \qquad (1)$$

The parameter $\alpha \ge 0$ is called the attachment parameter.

• Assignment of search distance d. An integer $d$ is chosen with probability

$$P_\beta(d) = \frac{d^{-\beta}}{\sum_{m=1}^{\infty} m^{-\beta}} \qquad (2)$$

where $\beta > 1$ is the distance decay parameter. In our experiments, we use a
numerical approximation of the sum in the denominator of Eq. (2).

• Generation of search path. In the search for node $j$, assume that at a given
instant the search is at node $r$, where initially $r = i$. A step of the search
occurs by randomly choosing a neighbor of $r$, defined as a node $l$ with an edge
connecting it to $r$. We do not allow the search to retrace its steps, so nodes $l$
that have already been visited are excluded. Furthermore, to make the search more
efficient, the probability of choosing node $l$ is weighted based on its unused
degree $u(l)$, which is defined as the number of neighbors of $l$ that have not yet
been visited [23]. The probability $P_\gamma(l)$ of selecting a given neighbor $l$
is normalized over the $M$ unvisited nearest neighbors of node $r$. The parameter
$\gamma \ge 0$ is called the routing parameter. If there are no unvisited neighbors
of $r$, the search is terminated, a new node is created, and an edge is drawn
between the new node and node $i$. Otherwise this process is repeated up to $d$
steps, and a new edge is drawn between node $j = l$ and node $i$. In the first case
we call this node creation, and in the second case, cycle formation.
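
The three steps above can be sketched in Python as follows. This is only an illustrative sketch, not the simulation code used for the results. In particular, the neighbor weight `1 + u(l)**gamma`, the truncation of the distance distribution at `d_max`, the uniform choice for the initial isolated node, and the silent dropping of duplicate links are all assumptions made here; the function and parameter names are hypothetical.

```python
import itertools
import random


def grow_feedback_network(steps, alpha, beta, gamma, d_max=1000, seed=0):
    """Grow a graph by repeated node selection, search, and link creation.

    Returns an adjacency dict {node: set(neighbors)}.
    """
    rng = random.Random(seed)
    adj = {0: set()}  # start from a single isolated node

    # Distance d drawn with probability proportional to d^(-beta),
    # truncated at d_max (assumption: finite cutoff for the infinite sum).
    distances = list(range(1, d_max + 1))
    cum = list(itertools.accumulate(d ** -beta for d in distances))

    for _ in range(steps):
        nodes = list(adj)
        # Starting node i: probability proportional to degree^alpha.
        weights = [len(adj[n]) ** alpha for n in nodes]
        if sum(weights) == 0:  # isolated initial node with alpha > 0
            weights = [1.0] * len(nodes)  # assumption: uniform fallback
        i = rng.choices(nodes, weights=weights)[0]
        d = rng.choices(distances, cum_weights=cum)[0]

        r, visited, success = i, {i}, True
        for _step in range(d):
            candidates = sorted(l for l in adj[r] if l not in visited)
            if not candidates:  # dead end: the search fails
                success = False
                break
            # Unused degree u(l): neighbors of l not yet visited.
            # Assumed weight: 1 + u(l)^gamma.
            cw = [1 + sum(1 for m in adj[l] if m not in visited) ** gamma
                  for l in candidates]
            r = rng.choices(candidates, weights=cw)[0]
            visited.add(r)

        if success:  # cycle formation: link i to j = r
            adj[i].add(r)
            adj[r].add(i)
        else:  # node creation: link i to a new node
            new = len(adj)
            adj[new] = {i}
            adj[i].add(new)
    return adj
```

For example, `grow_feedback_network(200, alpha=1.0, beta=1.3, gamma=1.0)` returns a connected graph in which each step has added at most one edge.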



III. RESULTS 

Typical feedback networks with $N = 250$ for $(\alpha, \beta, \gamma)$ of
{(0, 1.3, 0), (0, 1.3, 1), (1, 1.3, 0), (1, 1.3, 1)} are shown in Figures 1 and 2.
The two figures display different depictions of the same four graphs. In Figure 1
the sizes of the nodes represent their degrees, and in Figure 2 the thickness of
each edge is proportional to the number of successfully created feedback cycles in
which the edge participated (i.e. the number of times the search traversed this
edge).

The attachment parameter $\alpha$ controls the extent to which the graph tends to
form hubs (highly connected nodes). When $\alpha = 0$ there is no tendency to form
hubs, whereas when $\alpha$ is large there tend to be fewer hubs. As the distance
decay parameter $\beta$ increases, the network tends to become denser due to the
fact that $d$ is typically very small. As $\gamma$ increases the search tends to
seek out nodes with higher connectivity, there is a higher probability of
successful cycle formation, and the resulting graphs tend to be more interconnected
and less tree-like.

Despite the fact that network formation in our model depends purely on local
information, i.e. each step only depends on information about nodes and their
nearest neighbors, the probability of cycle formation is strongly dependent on the
global properties of the graph, which evolve as the network is being constructed.
In our model there is a competition between successful searches, which increase the
degree of two nodes and leave the number of nodes unaltered, and unsuccessful
searches, which increase the degree of an existing node but also create a new node
with degree one. Successful searches lower the mean distance of a node to other
nodes, and failed searches increase this distance. This has a stabilizing effect: a
nonzero rate of failed searches is needed to increase distances so that future
searches can succeed. Using this mechanism to grow the network ensures that local
connectivity structures, in terms of the mean distance of a node to other nodes,
are somewhat similar across nodes, thus creating long-range correlations between
nodes. Because these involve long-range interactions, we check whether the
resulting degree distributions can be described by the form

$$p(k) = p_0\, k^{\delta}\, e_q^{-k/\kappa} \qquad (4)$$

where the q-exponential function is defined as

$$e_q^{x} \equiv [1 + (1-q)x]^{1/(1-q)} \qquad (5)$$

if $1 + (1-q)x > 0$, and zero otherwise. This reduces to the usual exponential
function when $q = 1$, but when $q \neq 1$ it asymptotically approaches a power law
in the limit $x \to \infty$. When $q > 1$, the case of interest here, it
asymptotically decays to zero. The factor $p_0$ coincides with $p(0)$ if and only
if $\delta = 0$; $\kappa$ is a characteristic degree number. The q-exponential
function arises naturally as the solution of the equation $dx/dt = x^q$, which
occurs as the leading behavior at some critical points. It has also been shown [13]
to arise as the stationary solution of a nonlinear Fokker-Planck equation also
known as the Porous Medium Equation. Various mesoscopic mechanisms (involving
multiplicative noise) have already been identified which yield this type of
solution [?].
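
As a concrete illustration of the q-exponential and of the degree Ansatz built on it, a minimal Python sketch follows. The function names `q_exp` and `degree_ansatz` are hypothetical, and the Ansatz is evaluated only for $k > 0$; this is a sketch under those assumptions rather than the code used for the fits.

```python
import math


def q_exp(x, q):
    """q-exponential: [1 + (1 - q) x]^(1 / (1 - q)) where the bracket is
    positive, zero otherwise; reduces to exp(x) at q = 1."""
    if q == 1:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0:
        return 0.0
    return base ** (1.0 / (1.0 - q))


def degree_ansatz(k, p0, delta, q, kappa):
    """Degree distribution form p(k) = p0 * k^delta * e_q^(-k / kappa), k > 0."""
    return p0 * k ** delta * q_exp(-k / kappa, q)
```

With $q = 1.5$, for instance, `q_exp(-2.0, 1.5)` evaluates $[1 + 0.5 \cdot 2]^{-2} = 0.25$, and a finite-difference derivative of `q_exp(t, q)` matches `q_exp(t, q) ** q`, in line with $dx/dt = x^q$.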

Finally, the q-exponential distribution also emerges from maximizing the entropy
$S_q$ [15] under a constraint that characterizes the number of degrees per node of
the distribution. Let us briefly recall this derivation. Consider the entropy

$$S_q \equiv \frac{1 - \int_0^\infty dk\, [p(k)]^q}{q-1} \qquad \left[ S_1 = S_{BG} \equiv -\int_0^\infty dk\, p(k) \ln p(k) \right]$$

where we assume $k$ as a continuous variable for simplicity, and BG stands for
Boltzmann-Gibbs. If we extremize $S_q$ with the constraints

where the Lagrange parameter $\beta \equiv 1/\kappa$ is determined through Eq. Both
constraints impose $q < 2$.

Now to arrive at the Ansatz (4) that we have used in this paper, we must provide
some plausibility to the factor $k^{\delta}$ in front of the q-exponential. It
happens that this factor is the most frequent form of density of states in
condensed matter physics (it exactly corresponds to systems of arbitrary
dimensionality whose quantum energy spectrum is proportional to an arbitrary power
of the wave-vector of the particles or quasi-particles; depending on the system,
$\delta$ can be positive, negative, or zero, in which case the Ansatz reproduces a
simple q-exponential). Such a density of states concurrently multiplies the
Boltzmann-Gibbs factor, which is here naturally represented by $e_q^{-k/\kappa}$.
In addition to this, Ansatz (4) provided very satisfactory results in financial
models where a plausible scale-free network basis was given to account for the
distribution of stock trading volumes [19]. An interesting financial mechanism
using multiplicative noise has been recently proposed which precisely leads to a
stationary state distribution of the form (4). It is for this ensemble of heuristic
reasons that we checked the form (4). The numerical results that we obtained proved
a posteriori that this choice was a good one.

To study the node degree distribution $p(k)$, i.e. the frequency with which nodes
have $k$ neighbors, we simulate 10 realizations of networks with $N = 5000$ for
different values of the parameters $\alpha$, $\beta$ and $\gamma$. Some results are
shown in Figure 3. We fit q-exponential functions to the empirical distributions
using the Gauss-Newton algorithm for nonlinear least-squares estimates of the
parameters. Due to limitations of the fitting software we used, we had to manually
correct the fitting for the tail regions of the distribution. In Table I we give
the parameters of the best fits for various values of $\alpha$, $\beta$, and
$\gamma$, demonstrating that the degree distribution depends on all three
parameters. The solid curves in Figure 3 represent the best fit to a q-exponential.
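
The empirical distribution itself is simple to tabulate. The sketch below pools the node degrees of several realizations into relative frequencies; pooling the realizations before normalizing is an assumption made here for illustration, and the function name is hypothetical.

```python
from collections import Counter


def empirical_degree_distribution(degree_lists):
    """Pool node degrees over realizations; return {k: relative frequency}."""
    counts = Counter(k for degrees in degree_lists for k in degrees)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}
```

Each element of `degree_lists` is the list of node degrees of one simulated network, so ten realizations give a list of ten lists.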

The fits to the q-exponential are extremely good in every case. To test the
goodness of fit, we performed Kolmogorov-Smirnov (K-S) and Wilcoxon (W) rank sum
tests. Due to the fact that the q-exponential is defined only on $[0, \infty)$, we
used a two sample K-S test. To deal with the problem that the data are very sparse
in the tail, we excluded data points with sample probability less than $10^{-4}$.
For the K-S test the null hypothesis is never rejected, and for the W test one case
out of twelve is rejected, with a p value of 0.03. Thus we can conclude that there
is no evidence that the q-exponential is not the correct functional form.
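
For readers unfamiliar with the two sample K-S test, the statistic it is built on is the largest gap between two empirical cumulative distribution functions. The following pure-Python sketch computes only that statistic, handling the ties that integer-valued degree data produce. It omits the p-value computation and is not the software used for the tests reported here; the function name is hypothetical.

```python
def ks_two_sample_statistic(sample_a, sample_b):
    """Return D = max over x of |F_a(x) - F_b(x)| for two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every observation equal to x (ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max
```

Identical samples give 0 and samples with disjoint ranges give 1; a small value of the statistic is what leads the test not to reject the null hypothesis.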

TABLE I: Parameters for the best fit to a q-exponential function for networks with
different parameters. The first three columns are the parameters of the network
model, and the next three columns are the parameters for the fit to the
q-exponential. The exponent $b$ is defined by $b = 1/(q-1) - \delta$ (see the
text). The last two columns are p-values for nonparametric statistical
Kolmogorov-Smirnov (K-S) and Wilcoxon rank sum (W) tests. The standard acceptance
criterion is to have p > 0.05, i.e., less than one failure in twenty. The asterisk
marks the one case where the null hypothesis was rejected. Consequently, if we
demand that both K-S and W tests are satisfied, we obtain failure in only one among
the twelve cases that we have analyzed.

Network model | Fitted parameters | p-values for nonparametric tests

From Eq. (4) we straightforwardly verify that, in the $k \to \infty$ limit, we
obtain (see also Figure 3) a Pareto distribution, of the form $a k^{-b}$, where
$a = p_0 [(q-1)/\kappa]^{-1/(q-1)}$ and $b = 1/(q-1) - \delta$. This corresponds to
scale-free behavior, i.e. the distribution remains invariant under the scale
transformation $k \to Kk$. In general, however, scale-free behavior is only
approached asymptotically, and the q-generalized exponential distribution, which
contains the Pareto distribution as a special case, gives a much better fit.
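
The Pareto limit can be checked in one line from the form (4), $p(k) = p_0 k^{\delta} e_q^{-k/\kappa}$, together with the definition of the q-exponential. The short derivation below assumes $q > 1$ and $\kappa > 0$, and introduces no symbols beyond those already used in the text.

```latex
\begin{align}
p(k) &= p_0\, k^{\delta} \left[ 1 + (q-1)\frac{k}{\kappa} \right]^{-\frac{1}{q-1}} \\
     &\xrightarrow{\;k \to \infty\;}
        p_0 \left[ \frac{q-1}{\kappa} \right]^{-\frac{1}{q-1}} k^{\,\delta - \frac{1}{q-1}}
      = a\, k^{-b},
\qquad
a = p_0 \left[ \frac{q-1}{\kappa} \right]^{-\frac{1}{q-1}},
\quad
b = \frac{1}{q-1} - \delta .
\end{align}
```

In this limit $p(Kk) = K^{-b} p(k)$, so rescaling $k$ changes only the prefactor, which is the scale invariance referred to above.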

Parameters of model vs. q-exponential. To understand how the parameters of the
q-exponential depend on those of the model, we estimated the parameters of the
q-exponential for $\alpha$ = {0, 0.25, 0.5, 0.75, 1}, $\beta$ = {1.1, 1.2, 1.3,
1.4, 1.5} and $\gamma$ = {0, 0.5, 1}. Figure 4 studies the dependence of $\delta$
on the graph parameters, and Figure 5 studies the dependence of $q$ and $\kappa$.

It is clear that $\delta$ depends solely on the attachment parameter $\alpha$. The
other two q-exponential parameters ($q$ and $\kappa$) depend on all three model
parameters. The parameter $\kappa$ diverges when $\beta$ and $\gamma$ grow large
and $\alpha = 0$. The $q$ parameter grows rapidly as each of the three model
parameters increases.

In Figure 6 we study the distribution of edge weights, where an edge weight is
defined as the number of times an edge participates in the construction of a
feedback cycle (i.e. how many times it is traversed during the search leading to
the creation of the cycle). From this figure it is clear that this property is
nearly independent of the attachment parameter $\alpha$, but depends strongly on
the routing parameter $\gamma$.



IV. CONCLUSIONS 

In this paper we have presented a generative model 
for creating graphs representing feedback networks. The 
construction algorithm is strictly local, in the sense that 
any given step in the construction of a network only re- 
quires information about the nearest neighbors of nodes. 
Nonetheless, the resulting networks display long-range 
correlations in their structure. This is reflected in the 
fact that the q-exponential distribution, which is associ- 
ated with long-range correlation in problems in statistical 
mechanics, provides a good fit to the degree distribution. 

We think this adds an important contribution to the 
literature on the generation of networks by illustrating 
a mechanism that specifically focuses on the competi- 
tion between consolidation by adding cycles, which rep- 
resent stronger feedback within the network, and growth 
in size by simply adding more nodes. In future work, we 
hope to apply the present model to real networks such 
as biotech intercorporate networks, medieval trading net- 
works, marriage networks, and other real examples. 
FIG. 1: Representations of typical network models with 250 nodes for $\beta = 1.3$.
The panels correspond to (a) $\alpha = 0, \gamma = 0$, (b) $\alpha = 0, \gamma = 1$,
(c) $\alpha = 1, \gamma = 0$ and (d) $\alpha = 1, \gamma = 1$. Sizes of nodes are
proportional to their degrees. In the bottom graphs hubs emerge spontaneously due
to preferential attachment ($\alpha = 1$) while on the right more clustering occurs
because of the larger routing parameter in cycle formation ($\gamma = 1$). Notice
that the denomination preferential attachment is also used in the literature in a
slightly different sense, namely when the probability of a new node to attach to a
pre-existing one of the growing network is proportional to the degree of the
pre-existing one.
FIG. 2: Representations of typical network models with 250 nodes for $\beta = 1.3$.
The panels correspond to (a) $\alpha = 0, \gamma = 0$, (b) $\alpha = 0, \gamma = 1$,
(c) $\alpha = 1, \gamma = 0$ and (d) $\alpha = 1, \gamma = 1$. The thickness of an
edge is proportional to the number of successfully created feedback cycles in which
the edge has participated. The networks on the right of Figs. 1 and 2 show clusters
of connected hubs with well-traversed routes around the clusters, while in those on
the left, which are more tree-like, hubs connect but not in clusters with
well-traversed routes around them.
FIG. 3: Degree distributions and fits to a q-exponential for simulations of
networks with $N = 5000$ and 10 realizations. The dots correspond to the
empirically observed frequency of each degree; the lowest row of dots in each case
corresponds to observing one node with that degree. The solid curves represent the
best fit to a q-exponential. In each case $\alpha$ has the three values {0, 0.5, 1},
corresponding to black, blue and red respectively. (a) $\beta = 1.2, \gamma = 0$;
(b) $\beta = 1.2, \gamma = 1$; (c) $\beta = 1.4, \gamma = 0$; and (d)
$\beta = 1.4, \gamma = 1$. Note that the scale of the x-axes changes. The
parameters of the fitted generalized q-exponential functions are given in Table I.



[Figure 4 plot: title "Dependence of δ parameter"; horizontal axis $\alpha$ from 0.0
to 1.0; vertical axis $\delta$; one series for each combination of $\beta$ in
{1.1, 1.3, 1.5} and $\gamma$ in {0, 0.5, 1}.]

FIG. 4: Dependence of the q-exponential parameter $\delta$ on the network parameters $\alpha$, $\beta$, and $\gamma$.








FIG. 5: Dependencies of q-exponential parameters $q$ and $\kappa$ that were fitted to network models.
FIG. 6: Distribution of edge weights. Edge weights represent the number of
successfully created feedback cycles in which an edge participated. The parameter
$\beta = 1.3$, but $\alpha$ and $\gamma$ vary. These calculations are based on 100
realizations of networks growing to $N = 500$. The edge weight distribution shifts
only slightly to the right when increasing the distance decay parameter $\beta$
while varying $\alpha$ but keeping $\gamma$ constant.



